Chinese Microblogs Sentiment Classification using Maximum Entropy

Authors

  • Dashu Ye
  • Peijie Huang
  • Kaiduo Hong
  • Zhuoying Tang
  • Weijian Xie
  • Guilong Zhou
Abstract

This paper presents our Chinese microblog sentiment classification (CMSC) system for the Topic-Based Chinese Message Polarity Classification task of the SIGHAN-8 Bake-Off. Given a message from the Chinese Weibo platform and a topic, the system classifies whether the message expresses positive, negative, or neutral sentiment towards the given topic. Owing to difficulties such as out-of-vocabulary Internet words and emoticons, polarity classification of Chinese microblogs remains an open problem. Our system employs Maximum Entropy (MaxEnt), a discriminative model that directly models the class posteriors, which allows it to incorporate a rich set of features. Moreover, an oversampling approach is used to handle the class imbalance problem. Evaluation results demonstrate the utility of our system, showing an accuracy of 66.4% in the restricted-resource setting and 66.6% in the unrestricted-resource setting.
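The two ingredients named in the abstract, a MaxEnt classifier over sparse features and oversampling of minority classes, can be illustrated with a minimal sketch. It is not the authors' system: the bag-of-tokens features, the random-duplication oversampling, the SGD training loop, and all names and hyperparameters (`featurize`, `oversample`, `train`, `lr`, `l2`) are assumptions made for illustration, and the topic is ignored for brevity.

```python
import math
import random
from collections import defaultdict

LABELS = ["positive", "negative", "neutral"]


def featurize(tokens):
    """Binary bag-of-tokens features (assumed stand-in for a richer feature set)."""
    return set(tokens)


def oversample(data, seed=0):
    """Random oversampling: duplicate examples of smaller classes until
    every class has as many examples as the largest one."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for tokens, label in data:
        by_label[label].append((tokens, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    rng.shuffle(balanced)
    return balanced


def posteriors(weights, feats):
    """MaxEnt class posterior: P(y | x) = exp(sum_f w[y, f]) / Z(x)."""
    scores = {y: sum(weights[(y, f)] for f in feats) for y in LABELS}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train(data, epochs=50, lr=0.5, l2=0.01):
    """Stochastic gradient ascent on the L2-regularized conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for tokens, label in data:
            feats = featurize(tokens)
            probs = posteriors(weights, feats)
            for y in LABELS:
                # Gradient for an active feature: observed minus expected count.
                grad = (1.0 if y == label else 0.0) - probs[y]
                for f in feats:
                    weights[(y, f)] += lr * (grad - l2 * weights[(y, f)])
    return weights


def classify(weights, tokens):
    """Return the label with the highest posterior probability."""
    probs = posteriors(weights, featurize(tokens))
    return max(probs, key=probs.get)


# Toy, hand-made data (hypothetical; already segmented into tokens).
toy_data = [
    (["手机", "很", "好", "[赞]"], "positive"),
    (["手机", "太", "差", "[怒]"], "negative"),
    (["电池", "太", "差"], "negative"),
    (["手机", "今天", "发布"], "neutral"),
    (["发布", "会", "今天"], "neutral"),
    (["今天", "发布", "新", "手机"], "neutral"),
]
model = train(oversample(toy_data))
print(classify(model, ["很", "好"]))
```

Because MaxEnt models the posterior directly, emoticon tokens such as `[赞]` can be added as ordinary features without any independence assumption. Oversampling is applied to the training set only, so the evaluation data keep their original class distribution.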

Similar resources

MHSubLex: Using Metaheuristic Methods for Subjectivity Classification of Microblogs

In Web 2.0, people are free to share their experiences, views, and opinions. One of the problems that arises in Web 2.0 is the sentiment analysis of texts produced by users in outlets such as Twitter. One of the main tasks of sentiment analysis is subjectivity classification. Our aim is to classify the subjectivity of Tweets. To this end, we create subjectivity lexicons in which the words into ...

Reserved Self-training: A Semi-supervised Sentiment Classification Method for Chinese Microblogs

The imbalanced sentiment distribution of microblogs causes binary classifiers to perform poorly on the minority class. To address this problem, we present a semi-supervised method for sentiment classification of Chinese microblogs. This method is similar to self-training, except that a set of labeled samples is reserved for a confidence score computing process through which samples that are ...

Every Term Has Sentiment: Learning from Emoticon Evidences for Chinese Microblog Sentiment Analysis

Chinese microblog is a popular Internet social medium where users express their sentiments and opinions. But sentiment analysis on Chinese microblogs is difficult: The lack of labeling on the sentiment polarities restricts many supervised algorithms; out-of-vocabulary words and emoticons enlarge the sentiment expressions, which are beyond traditional sentiment lexicons. In this paper, emoticons i...

Opinion Sentence Extraction and Sentiment Analysis for Chinese Microblogs

Sentiment analysis of Chinese microblogs is important for scientific research in public opinion supervision, personalized recommendation and social computing. By studying the evaluation task of NLP&CC’2012, we mainly implement two tasks, namely the extraction of opinion sentence and the determination of sentiment orientation for microblogs. First, we manually label the sample of microblog corpu...

Collective Opinion Target Extraction in Chinese Microblogs

Microblog messages pose severe challenges for current sentiment analysis techniques due to some inherent characteristics such as the length limit and informal writing style. In this paper, we study the problem of extracting opinion targets of Chinese microblog messages. Such a fine-grained word-level task has not been well investigated in microblogs yet. We propose an unsupervised label propagati...

Publication date: 2015